Background

"Dangerous" gene families, defined as prone to dominant 
(gain-of-function) mutations, have been greatly expanded 
in the course of vertebrate evolution by contrast to gene 
families more prone to recessive (loss-of-function) muta- 
tions. While the maintenance of "essential" genes is 
ensured by their lethal double null mutations, the expan- 
sion of "dangerous" gene families, implicated in cancer 
and other severe genetic diseases in humans, remains puzzling.
Could gene susceptibility to dominant deleterious
mutations be somehow responsible for this striking evo- 
lutionary expansion of "dangerous" gene families? 

Results 

We proposed such an evolutionary model, suggesting
that this counterintuitive expansion of "dangerous" gene
families is in fact a consequence of their susceptibility to
deleterious mutations and of purifying selection in
polyploid species that arose from two rounds of whole
genome duplication (WGD) dating back to the
onset of jawed vertebrates, some 500 MY ago [1,2]. All
WGD duplicates, so-called "ohnologs", were thus
initially acquired by speciation, without needing to provide
an evolutionary benefit to become fixed in post-WGD species.

Our data mining analyses, based on the 20,506 human 
protein coding genes, first revealed a strong correlation 
between the retention of ohnologs and their susceptibility 
to dominant deleterious mutations in humans [3]. It 
appears that the human genes associated with the occur- 
rence of cancer and other genetic diseases (8,095) have 
retained significantly more ohnologs than expected by 
chance (48% versus 35%; 48%: 3,844/8,095; P = 1.3×10⁻¹²⁸,
χ²). We also found that the retention of ohnologs is
more strongly related to their "dangerousness" than their
"essentiality" [3].

To go beyond mere correlations, we also performed 
mediation analyses, following the approach of Pearl [4], 
and quantified the direct and indirect effects of many 
genomic properties, such as essentiality, expression 
levels or divergence rates, on the retention of ohnologs. 
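
The decomposition into direct and indirect effects can be illustrated with a minimal sketch of Pearl's mediation formula for three binary variables. This is not the authors' pipeline: it assumes no unmeasured confounding, the function name `mediation_effects` is hypothetical, and the toy counts are invented for illustration only. Here X could stand for a gene property such as dosage balance, M for susceptibility to deleterious mutations, and Y for ohnolog retention.

```python
from collections import Counter


def mediation_effects(records):
    """Natural direct, natural indirect and total effect of X on Y via M.

    records: iterable of (x, m, y) tuples with binary values (0 or 1).
    Assumes every (x, m) cell is non-empty and that there is no
    unmeasured confounding (an assumption of this sketch).
    """
    n_xm = Counter()   # number of genes per (x, m) cell
    y_xm = Counter()   # number of genes with y = 1 per (x, m) cell
    n_x = Counter()    # number of genes per value of x
    for x, m, y in records:
        n_xm[(x, m)] += 1
        y_xm[(x, m)] += y
        n_x[x] += 1

    def e_y(x, m):     # E[Y | X = x, M = m]
        return y_xm[(x, m)] / n_xm[(x, m)]

    def p_m(m, x):     # P(M = m | X = x)
        return n_xm[(x, m)] / n_x[x]

    # Natural direct effect: change X from 0 to 1, keep M distributed as under X = 0.
    nde = sum((e_y(1, m) - e_y(0, m)) * p_m(m, 0) for m in (0, 1))
    # Natural indirect effect: keep X = 0, shift M to its distribution under X = 1.
    nie = sum(e_y(0, m) * (p_m(m, 1) - p_m(m, 0)) for m in (0, 1))
    # Total effect: E[Y | X = 1] - E[Y | X = 0].
    te = (sum(e_y(1, m) * p_m(m, 1) for m in (0, 1))
          - sum(e_y(0, m) * p_m(m, 0) for m in (0, 1)))
    return nde, nie, te


def expand(cells):
    """Turn {(x, m): (n_genes, n_retained)} into a list of (x, m, y) records."""
    records = []
    for (x, m), (n, k) in cells.items():
        records += [(x, m, 1)] * k + [(x, m, 0)] * (n - k)
    return records


# Hypothetical toy counts, NOT data from the study.
toy_cells = {(0, 0): (80, 16), (0, 1): (20, 8), (1, 0): (40, 12), (1, 1): (60, 30)}
nde, nie, te = mediation_effects(expand(toy_cells))
print(f"direct = {nde:.3f}, indirect = {nie:.3f}, total = {te:.3f}")
```

In this toy table the effect of X does not depend on the level of M, so the total effect equals the sum of the direct and indirect parts; with an interaction between X and M that additivity no longer holds in general.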

This enabled us to investigate an alternative hypothesis 
frequently invoked to account for the biased retention of 
ohnologs, namely the "dosage-balance" hypothesis [5]. 
While this hypothesis posits that ohnologs are
retained because their interactions with protein partners
require balanced expression levels to be maintained throughout
evolution, we found that most ohnologs have in
fact been eliminated from permanent complexes in
humans (7.5% versus 35%; 7.5%: 18/239; P = 1.2×10⁻¹⁸, χ²).
These mediation analyses also showed (Fig. 1) that the 
gene susceptibility to deleterious mutations is more rele- 
vant than dosage-balance for the retention of ohnologs in 
more transient complexes. 
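
The two enrichment comparisons quoted in the Results (3,844/8,095 and 18/239, each against a 35% background) can be checked with a minimal sketch of a one-degree-of-freedom χ² goodness-of-fit test. Several things here are assumptions: the abstract does not say whether a goodness-of-fit or a contingency-table test was used, the background rate is taken as the rounded 35%, and the function name `chi2_gof_binary` is hypothetical. The resulting P-values should therefore only be expected to match the reported ones in order of magnitude, not exactly.

```python
import math


def chi2_gof_binary(k, n, p0):
    """Chi-square goodness-of-fit test for one binary category (1 d.o.f.).

    k  : observed count in the category of interest (e.g. genes with ohnologs)
    n  : total number of genes in the subset
    p0 : expected proportion under the null (e.g. genome-wide ohnolog fraction)
    """
    expected_in = n * p0
    expected_out = n * (1.0 - p0)
    chi2 = ((k - expected_in) ** 2 / expected_in
            + ((n - k) - expected_out) ** 2 / expected_out)
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts are from the text; p0 = 0.35 is the rounded background rate (assumption).
for label, k, n in [("disease genes", 3844, 8095),
                    ("permanent complexes", 18, 239)]:
    chi2, p = chi2_gof_binary(k, n, 0.35)
    print(f"{label}: {k}/{n} = {k / n:.1%}, chi2 = {chi2:.1f}, P = {p:.1e}")
```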

Conclusions 

These results suggest that the retention of human ohno- 
logs is primarily caused by their susceptibility to deleter- 
ious mutations. They further establish that the retention 
of many ohnologs suspected to be dosage balanced is in 
fact indirectly mediated by their susceptibility to domi- 
nant deleterious mutations. 

All in all, this supports a new evolutionary model rely- 
ing on a non-adaptive mechanism that hinges on (i) the 
speciation event concomitant with WGD, and (ii) the
dominance of deleterious mutations leading to purifying
selection in post-WGD species.

Figure 1 Quantitative mediation analysis of direct versus indirect effects of deleterious mutations and dosage balance on the retention of
human ohnologs using (A) all human protein coding genes (20,506) or (B) human protein coding genes without SSD or CNV (8,215). The
thickness of the arrows outlines the relative importance of the corresponding direct or indirect effects. Dir.<0 or Ind.<0 corresponds to an
anticorrelated direct or indirect effect, respectively. Genes prone to deleterious mutations and/or to dosage balance (including haploinsufficient
genes and genes involved in multiprotein complexes) are taken from [3].